Interpretable Sparse Proximate Factors for Large Dimensions
Authors: Markus Pelger and Ruoxuan Xiong
Abstract
This article proposes sparse and easy-to-interpret proximate factors to approximate statistical latent factors. Latent factors in a large-dimensional factor model can be estimated by principal component analysis (PCA), but are usually hard to interpret. We obtain proximate factors that are easier to interpret by shrinking the PCA factor weights and setting them to zero except for the largest absolute ones. We show that proximate factors constructed with only 5%–10% of the data are sufficient to almost perfectly replicate the population factors, without actually assuming a sparse structure in the factors or loadings. Using extreme value theory, we explain why sparse proximate factors can serve as substitutes for non-sparse factors and derive analytical asymptotic bounds for the correlation of appropriately rotated proximate factors with the population factors. These bounds provide guidance on how to construct the proximate factors. In simulations and empirical analyses of financial portfolio and macroeconomic data, we illustrate that sparse proximate factors are close substitutes for non-sparse factors, with average correlations around 97.5%, while being interpretable.
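The construction described in the abstract (keep only the largest-absolute PCA weights for each factor, set the remaining weights to zero, and form factors from the sparse weights) can be sketched in a few lines. The following is a minimal NumPy sketch, not the paper's exact procedure: the function name proximate_factors, the keep_frac parameter, and the simulated example are illustrative assumptions, and the paper additionally rotates the proximate factors before measuring their correlation with the population factors.

import numpy as np

def proximate_factors(X, n_factors=3, keep_frac=0.10):
    """Sketch of sparse proximate factors for a demeaned T x N data matrix X.

    Keeps only the keep_frac largest-absolute PCA weights per factor,
    sets the rest to zero, and forms factors from the sparse weights.
    """
    N = X.shape[1]
    # PCA via SVD of the data matrix: rows of Vt are the factor weight vectors.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    W = Vt[:n_factors].T                               # N x n_factors PCA weights
    n_keep = max(1, int(np.ceil(keep_frac * N)))       # e.g. 10% of the cross-section

    W_sparse = np.zeros_like(W)
    for k in range(n_factors):
        idx = np.argsort(np.abs(W[:, k]))[-n_keep:]    # largest-absolute weights
        W_sparse[idx, k] = W[idx, k]                   # all other weights stay zero

    F_prox = X @ W_sparse                              # sparse proximate factors
    F_pca = X @ W                                      # non-sparse PCA factors
    return F_prox, F_pca, W_sparse

# Illustrative check on simulated data: correlation of each proximate factor
# with its non-sparse PCA counterpart.
rng = np.random.default_rng(0)
F_true = rng.standard_normal((200, 3))
Lambda = rng.standard_normal((100, 3))
X = F_true @ Lambda.T + rng.standard_normal((200, 100))
X = X - X.mean(axis=0)                                 # demean before PCA
F_prox, F_pca, _ = proximate_factors(X, n_factors=3, keep_frac=0.10)
print([round(abs(np.corrcoef(F_prox[:, k], F_pca[:, k])[0, 1]), 3) for k in range(3)])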
Similar resources
Interpretable sparse SIR for functional data
This work focuses on the issue of variable selection in functional regression. Unlike most work in this framework, our approach does not select isolated points in the definition domain of the predictors, nor does it rely on the expansion of the predictors in a given functional basis. It provides an approach to select full intervals made of consecutive points. This feature improves the interpret...
SPINE: SParse Interpretable Neural Embeddings
Prediction without justification has limited utility. Much of the success of neural models can be attributed to their ability to learn rich, dense and expressive representations. While these representations capture the underlying complexity and latent trends in the data, they are far from being interpretable. We propose a novel variant of denoising k-sparse autoencoders that generates highly ef...
Interpretable Sparse High-Order Boltzmann Machines
Fully-observable high-order Boltzmann Machines are capable of identifying explicit high-order feature interactions theoretically. However, they have never been used in practice due to their prohibitively high computational cost for inference and learning. In this paper, we propose an efficient approach for learning a fully-observable high-order Boltzmann Machine based on sparse learning and cont...
Provable De-anonymization of Large Datasets with Sparse Dimensions
There is a significant body of empirical work on statistical de-anonymization attacks against databases containing micro-data about individuals, e.g., their preferences, movie ratings, or transaction data. Our goal is to analytically explain why such attacks work. Specifically, we analyze a variant of the Narayanan-Shmatikov algorithm that was used to effectively de-anonymize the Netflix databa...
Nonlinear Spike-And-Slab Sparse Coding for Interpretable Image Encoding
Sparse coding is a popular approach to model natural images but has faced two main challenges: modelling low-level image components (such as edge-like structures and their occlusions) and modelling varying pixel intensities. Traditionally, images are modelled as a sparse linear superposition of dictionary elements, where the probabilistic view of this problem is that the coefficients follow a L...
Journal
Journal title: Journal of Business & Economic Statistics
Year: 2021
ISSN: 1537-2707, 0735-0015
DOI: https://doi.org/10.1080/07350015.2021.1961786